Jian Luan

Restoring Initial Noise Sensitivity in Text-to-Image Distillation via Geometric Alignment

Jun 01, 2026
Via arXiv

Scaling, Benchmarking, and Reasoning of Vision-Language Agents for Mobile GUI Navigation

May 26, 2026
Via arXiv

PixelWizard: Towards Efficient High-Fidelity Video Generation at Ultra-Large Spatial Resolution

May 25, 2026
Via arXiv

SimuWoB: Simulating Real-World Mobile Apps for Fast and Faithful GUI Agent Benchmarking

May 24, 2026
Via arXiv

Beyond Binary: Reframing GUI Critique as Continuous Semantic Alignment

May 14, 2026
Via arXiv

PROVE: A Perceptual RemOVal cohErence Benchmark for Visual Media

May 14, 2026
Via arXiv

How Mobile World Model Guides GUI Agents?

May 11, 2026
Via arXiv

Listening with Time: Precise Temporal Awareness for Long-Form Audio Understanding

Apr 24, 2026
Via arXiv

TTS-PRISM: A Perceptual Reasoning and Interpretable Speech Model for Fine-Grained Diagnosis

Apr 24, 2026
Via arXiv

ControlFoley: Unified and Controllable Video-to-Audio Generation with Cross-Modal Conflict Handling

Apr 16, 2026
Via arXiv